@HannesWell
Member

@HannesWell HannesWell commented Oct 23, 2025

In PDE we recently had issues with tests waiting forever, which led to builds that ran for a long time until they hit the general build timeout of multiple hours.
While this limit can of course be specified only in the PDE root pom, I think it makes sense to have a general limit by default. My proposal is to set it to 20 min per test project. This is more than sufficient for most test projects, which run for just a few minutes, and would still give relatively fast feedback if something is flawed and hangs.

After a quick search I identified the following projects for which the limit might be too strict (in general or on slow computers):

  • org.eclipse.jdt.core.tests.compiler running ~45 min
  • org.eclipse.jdt.core.tests.model running ~15 min
  • org.eclipse.equinox.p2.tests running ~15 min
  • org.eclipse.osgi.tests running ~10 min

There we could simply override the general timeout and set it to a more suitable value, either with a similar property definition in the pom.xml or with a corresponding pom.model.property.surefire.timeout entry in the build.properties (for projects that are already pomless).
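
As a rough sketch of such a per-project override (not taken from any existing pom): this assumes the limit is driven by the `surefire.timeout` user property of the tycho-surefire-plugin and that the value is given in seconds. The 5400 (90 min) shown for a long-running project like org.eclipse.jdt.core.tests.compiler is purely illustrative.

```xml
<!-- pom.xml of the individual test project (illustrative value) -->
<properties>
  <!-- assumed to be seconds: 90 min * 60 = 5400 -->
  <surefire.timeout>5400</surefire.timeout>
</properties>
```

For a pomless project the equivalent would presumably be the line `pom.model.property.surefire.timeout = 5400` in its build.properties.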

What do the Platform and Equinox committers think?
@stephan-herrmann, @jarthana, @mpalat what do you think about this for JDT?

@stephan-herrmann
Contributor

For JDT I don't see a need to change anything, but if the end result is similar to the current situation then I have no objections, of course. In particular I don't have any data about how times per test project deviate from an average.

In case we get undesired timeouts for a particular build job, can the job override the timeout value of test projects?

@stephan-herrmann
Contributor

I'm seeing strange timeouts in jdt.core PR builds these days. Could those be related to changes in platform?

Have a look at https://ci.eclipse.org/jdt/job/eclipse.jdt.core-Github/job/PR-4573/

Executions 2 and 3 claim to have hit a timeout after 90 min.

In both cases things go wrong in this phase:

23:01:05  [INFO] --- tycho-packaging:5.0.1-SNAPSHOT:package-plugin (default-package-plugin) @ org.eclipse.jdt.core.tests.model ---
23:01:05  [INFO] Building jar: /home/jenkins/agent/workspace/eclipse.jdt.core-Github_PR-4573/org.eclipse.jdt.core.tests.model/target/org.eclipse.jdt.core.tests.model-3.13.600-SNAPSHOT.jar
23:02:02  
23:02:02  Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "HttpClient-7-SelectorManager"
23:02:02  
23:02:02  Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "HttpClient-8-SelectorManager"
23:02:02  
23:02:02  Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "HttpClient-1-SelectorManager"
23:02:14  
23:02:14  Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "HttpClient-10-SelectorManager"
23:02:22  
23:02:22  Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "HttpClient-6-SelectorManager"
23:02:25  
23:02:25  Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "HttpClient-9-SelectorManager"
23:02:26  
23:02:26  Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "HttpClient-2-SelectorManager"
23:02:30  
23:02:30  Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "Worker-0"
23:02:32  
23:02:32  Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "HttpClient-5-SelectorManager"
23:02:37  
23:02:37  Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "HttpClient-4-SelectorManager"
23:02:39  
23:02:39  Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "Worker-1"
23:02:54  
23:02:54  Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "HttpClient-3-SelectorManager"
23:03:06  
23:03:06  Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "main"
  Cancelling nested steps due to timeout
23:40:39  Sending interrupt signal to process
23:40:49  OpenJDK 64-Bit Server VM warning: Exception java.lang.OutOfMemoryError occurred dispatching signal SIGTERM to handler- the VM may need to be forcibly terminated
23:40:49  OpenJDK 64-Bit Server VM warning: Exception java.lang.OutOfMemoryError occurred dispatching signal SIGTERM to handler- the VM may need to be forcibly terminated
23:40:49  script returned exit code 143

I read this as

  • at 23:02 we run out of memory (while building a simple jar, which shouldn't be a memory-intensive operation)
  • at 23:40 the VM still hasn't shut down; presumably exception handlers have gone wild for 38 minutes
    • is this when SIGTERM is sent at the OS level?
  • the last lines look like another OOME occurred as a consequence of SIGTERM

So, perhaps timeout is the secondary thing here, but I haven't seen this kind of problem before, and now it happened in pretty much the same way twice in a row!

@stephan-herrmann
Contributor

> So, perhaps timeout is the secondary thing here, but I haven't seen this kind of problem before, and now it happened in pretty much the same way twice in a row!

Right, this observation belongs to #3436
